Proximal Bellman mappings for reinforcement learning and their application to robust adaptive filtering
This paper aims at the algorithmic/theoretical core of reinforcement learning
(RL) by introducing the novel class of proximal Bellman mappings. These
mappings are defined in reproducing kernel Hilbert spaces (RKHSs) to benefit
from the rich approximation properties and inner product of RKHSs. They are
shown to belong to the powerful Hilbertian family of (firmly) nonexpansive
mappings, regardless of the values of their discount factors, and possess ample
degrees of design freedom to even reproduce attributes of the classical Bellman
mappings and to pave the way for novel RL designs. An approximate
policy-iteration scheme is built on the proposed class of mappings to solve the
problem of selecting online, at every time instance, the "optimal" exponent p
in a p-norm loss to combat outliers in linear adaptive filtering, without
training data or any knowledge of the statistical properties of the outliers.
Numerical tests on synthetic data showcase the superior performance of the
proposed framework over several non-RL and kernel-based RL schemes.
Comment: arXiv admin note: text overlap with arXiv:2210.1175
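As a rough illustration of the p-norm loss mentioned in the abstract above, the following sketch runs a least-mean-p-power style update for a linear adaptive filter with a fixed exponent p. It is not the paper's method: the paper selects p online through its policy-iteration scheme, which is not reproduced here. The names `lmp_update` and `run_filter`, the step size `mu`, the two-tap system `w_true`, and the outlier model are all assumptions made for this example.

```python
import random


def lmp_update(w, x, d, p=1.5, mu=0.01):
    """One stochastic-gradient step on the loss |d - w.x|**p, with p >= 1.

    Returns the updated weights and the a-priori error e = d - w.x.
    """
    e = d - sum(wi * xi for wi, xi in zip(w, x))
    if e == 0.0:
        # The gradient step vanishes when the error is exactly zero.
        return list(w), e
    sign = 1.0 if e > 0 else -1.0
    # Negative gradient of |e|**p with respect to w is g * x.
    g = p * abs(e) ** (p - 1) * sign
    return [wi + mu * g * xi for wi, xi in zip(w, x)], e


def run_filter(p=1.5, mu=0.01, steps=2000, outlier_prob=0.05, seed=0):
    """Identify a hypothetical 2-tap system from data with sparse outliers."""
    rng = random.Random(seed)
    w_true = [1.0, -2.0]  # assumed system, for illustration only
    w = [0.0, 0.0]
    for _ in range(steps):
        x = [rng.uniform(-1.0, 1.0) for _ in w_true]
        d = sum(a * b for a, b in zip(w_true, x))
        if rng.random() < outlier_prob:
            d += rng.choice([-1.0, 1.0]) * 50.0  # assumed impulsive outlier
        w, _ = lmp_update(w, x, d, p=p, mu=mu)
    return w
```

With p = 2 the step is proportional to the error, as in least mean squares, while p = 1 uses only the sign of the error, which limits the influence of a single outlier. Comparing `run_filter(p=2.0)` with `run_filter(p=1.0)` on the same seed gives a feel for why the choice of exponent matters.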
Online dictionary learning from big data using accelerated stochastic approximation algorithms
Applications involving large-scale dictionary learning tasks motivate online optimization algorithms for generally non-convex and non-smooth problems. In this big data context, the present paper develops an online learning framework by jointly leveraging the stochastic approximation paradigm and first-order acceleration schemes. The generally non-convex objective evaluated online at the resultant iterates enjoys a quadratic rate of convergence. The generality of the novel approach is demonstrated in two online learning applications: (i) online linear regression using the total least-squares approach; and (ii) a semi-supervised dictionary learning approach to network-wide link load tracking and imputation of real data with missing entries. In both cases, numerical tests highlight the potential of the proposed online framework for big data network analytics.
Trading off communications bandwidth with accuracy in adaptive diffusion networks
In this paper, a novel algorithm for bandwidth reduction in adaptive distributed learning is introduced. We deal with diffusion networks, in which the nodes cooperate by exchanging information in order to estimate an unknown parameter vector of interest. We seek solutions in the framework of set-theoretic estimation. Moreover, in order to reduce the bandwidth required by the transmitted information, which is dictated by the dimension of the unknown vector, we choose to project and work in a lower-dimensional Krylov subspace. This provides the benefit of trading off dimensionality with accuracy. Full convergence properties are presented, and experiments on the system identification task demonstrate the robustness of the algorithmic technique.
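The dimensionality-versus-accuracy trade-off described above can be sketched as follows. The example builds an orthonormal basis of a Krylov subspace span{b, Ab, ..., A^(D-1) b} and sends only the D subspace coordinates of a vector instead of all its entries. This is a minimal sketch under stated assumptions, not the paper's algorithm: which matrix `A` and vector `b` generate the subspace is left open here, and the function names `krylov_basis`, `compress`, and `expand` are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(A, v):
    return [dot(row, v) for row in A]


def krylov_basis(A, b, dim, tol=1e-12):
    """Orthonormal basis of span{b, Ab, ..., A^(dim-1) b}.

    Uses an Arnoldi-style recursion with modified Gram-Schmidt.
    Stops early if the subspace stops growing.
    """
    basis = []
    v = list(b)
    for _ in range(dim):
        for q in basis:
            c = dot(q, v)
            v = [vi - c * qi for vi, qi in zip(v, q)]
        norm = dot(v, v) ** 0.5
        if norm < tol:
            break
        q = [vi / norm for vi in v]
        basis.append(q)
        v = matvec(A, q)
    return basis


def compress(basis, w):
    """Coordinates of w in the subspace: this is what a node would transmit."""
    return [dot(q, w) for q in basis]


def expand(basis, coeffs):
    """Rebuild a full-length vector from the received coordinates."""
    n = len(basis[0])
    return [sum(c * q[i] for c, q in zip(coeffs, basis)) for i in range(n)]
```

A vector that lies in the subspace is recovered exactly by `expand(basis, compress(basis, w))`. Any other vector is replaced by its orthogonal projection onto the subspace, so a smaller `dim` means fewer transmitted numbers at the cost of a larger reconstruction error.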
A Sparsity-Aware Adaptive Algorithm for Distributed Learning
In this paper, a sparsity-aware adaptive algorithm for distributed learning
in diffusion networks is developed. The algorithm follows the set-theoretic
estimation rationale. At each time instance and at each node of the network, a
closed convex set, known as a property set, is constructed based on the received
measurements; this defines the region in which the solution is searched for. In
this paper, the property sets take the form of hyperslabs. The goal is to find
a point that belongs to the intersection of these hyperslabs. To this end,
sparsity-encouraging variable metric projections onto the hyperslabs have been
adopted. Moreover, sparsity is also imposed by employing variable metric
projections onto weighted ℓ1 balls. A combine-adapt cooperation strategy
is adopted. Under some mild assumptions, the scheme enjoys monotonicity,
asymptotic optimality and strong convergence to a point that lies in the
consensus subspace. Finally, numerical examples verify the validity of the
proposed scheme, compared to other algorithms, which have been developed in the
context of sparse adaptive learning.
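For readers unfamiliar with hyperslabs, the sketch below shows the plain Euclidean projection onto a hyperslab {w : |d - x.w| <= eps} built from one measurement pair (x, d). The abstract above uses variable metric, sparsity-encouraging projections combined across network nodes; none of that is reproduced here. The function name `project_hyperslab` and the tolerance `eps` are assumptions made for this example.

```python
def project_hyperslab(w, x, d, eps):
    """Euclidean projection of w onto the hyperslab {v : |d - x.v| <= eps}.

    x is the regressor, d the measurement, eps >= 0 the half-width of the
    slab measured in output error.
    """
    norm_sq = sum(xi * xi for xi in x)
    if norm_sq == 0.0:
        return list(w)  # a zero regressor gives no direction to move along
    e = d - sum(wi * xi for wi, xi in zip(w, x))
    if e > eps:
        step = (e - eps) / norm_sq   # move to the boundary d - x.v = eps
    elif e < -eps:
        step = (e + eps) / norm_sq   # move to the boundary d - x.v = -eps
    else:
        return list(w)               # already inside the hyperslab
    return [wi + step * xi for wi, xi in zip(w, x)]
```

A point already inside the slab is returned unchanged, and a point outside lands exactly on the nearer boundary. Applying such projections repeatedly for successive measurements moves the estimate toward the intersection of the hyperslabs.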